------------------------------------------------------------------------------
 NINJA File Format Specifications                                 version 1.0
 Written by Derrick Sobodash                                   Copyright 2004
 Released on February 26, 2004                             http://d.the2d.com
------------------------------------------------------------------------------

 PURPOSE

  A blessing and bane of translating games is the IPS file format. While it
  may have been the international patching standard when files never
  exceeded 4MB, today it is completely unsuitable. PPF (Playstation Patch
  Format) was created to combat the 24-bit addressing limitations of IPS,
  but even PPF stops at 32-bit, which will one day be obsolete.

  Rather than being forced to upgrade to a new patching standard every few
  years, I have chosen to create a format of my own that will continue to
  be useable for at least another decade.

  Hence, the NINJA file formats were born.

------------------------------------------------------------------------------

 METHOD

  NINJA is based on two methods of file storage: textual and binary. A NINJA
  patch can be either of these types. Aside from storing the patch, some
  extra information is stored, including the CRC32, MD5 and SHA1 of the
  source file. The format of the source file is also stored -- this being
  the system it is for.

  Certain systems, such as the SNES, can have incredible variance in a file
  through lack or presence of header, interleaving methods, and splitting,
  while still being a good dump of the data. By storing the system, NINJA
  can harness extra functions to prepare that data so patches made with
  NINJA will work with any format of the original ROM. This should cut down
  on the number of emails from users who never think to check the CRC of
  their ROM against the one in the readme (if an author was insightful
  enough to supply one).

  Before beginning a comparison, you must convert the input files to
  whichever method is specified by the user. The standard for data is always
  no header, deinterleaved and decrypted for every system.

------------------------------------------------------------------------------

 STRUCTURE

  The standard extension for a NINJA patch is ".RUP"

  Every NINJA patch must begin with the following header:
    [5 bytes ] .......     NINJA     //ASCII text
    [1 byte  ] .......     .......   //VERSION # (in decimal)
    [2 bytes ] .......     .......   //PATCH ID, valid modes follow
              0x4220      "B "       // Binary format
              0x425a      "BZ"       // Binary + Gzip format
              0x540d      "T\n"      // Textual format
              0x545a      "TZ"       // Textual + Gzip format
  With formats supporting Gzip compression, the entire contents of the file
  following these format bytes will be a Gzip of patch. This should be
  decompressed to memory.

  Here is where textual format and binary format fork.

  Binary format header:
  
    [1 byte  ] .......     .......   //File format of the source file
                                     //Please see the VALID FORMATS section
                                     //for supported values.
    [4 bytes ] .......     .......   //CRC32 of the source file in binary
                                     //A value of "0" will skip this check
    [16 bytes] .......     .......   //MD5sum of the source file in binary
                                     //A value of "0" will skip this check
    [20 bytes] .......     .......   //SHA1 of the source file in binary
                                     //A value of "0" will skip this check

  Binary format patches:
    [1 byte  ] .......     .......   //# of bytes in the offset (x)
    [x bytes ] .......     .......   //Offset to patch at in big endian
    [1 byte  ] .......     .......   //# of bytes in the patch length (y)
    [y bytes ] .......     .......   //Length of the patch in big endian
    [* bytes ] .......     .......   //The patch (in binary)

  Binary format footer:
    [1 byte  ] 0x03        .......   //Since we process the file looking at
                                     //one byte to tell us how many to read
                                     //for the offset, we say 3 here.
    [3 bytes ] .......     EOF       //If you end up with "EOF" in your
                                     //offset, the patch is done

  Textual format header:
    [1 lines ] FORMAT CRC32 MD5SUM SHA1\n
                                     //CRC32 MD5SUM and SHA1 will accept a
                                     //value of "unk." It will bypass that
                                     //check when encountered.

  Textual format patches:
    [1 line  ] OFFSET PATCH_BYTES

  Textual format footer:
    (not applicable)

------------------------------------------------------------------------------

 VALID FORMATS

  This is the list of data formats recognized in v1.0 NINJA files. The
  number on the left is the decimal value to be used in binary patch format.

     0 raw,   // RAW data, no special processing
     1 nes,   // Nintendo Entertainment System/Famicom 8-bit
     2 snes,  // Super Nintendo Entertainment System/Super Famicom 16-bit
     3 n64,   // Nintendo 64
     4 gb,    // Game Boy
     5 gbc,   // Game Boy Color
     6 gba,   // Game Boy Advance
     7 ngp,   // NeoGeo Pocket
     8 ngpc,  // NeoGeo Pocket Color
     9 sms,   // Sega Master System
    10 gg,    // Game Gear
    11 mega,  // Genesis Megadrive
    12 pce,   // NEC TurboGrafx16/PC-Engine
    13 ws,    // Bandai WonderSwan
    14 wsc,   // Bandai WonderSwan Color/Crystal
    15 lynx,  // Atari Lynx
    16 jag,   // Atari Jaguar
    17 gp32   // Gamepark GP32

------------------------------------------------------------------------------

 LARGE FILES

  Because reading large files into RAM is not practical as of Version 1.0 of
  this document, we can't very well get a CRC32, MD5sum or SHA1 of a 100MB
  file without considerable lag time. Therefore, for NINJA 1.0, this is how
  we handle RAW files over 0x1e00000 bytes:

  Read in the first 0x1400000 bytes of the file, append it with the last
  0xa00000, then append that with the file size of the source (in decimal).
  Then use that string to make your CRC32, MD5sum and SHA1.

  There's a slight chance of error in detecting a bad file when applying the
  patch, but probably not enough to worry about. This should at least take
  care of anyone trying to patch something made off a CUE/BIN over an FCD or
  Alcohol 120% MDF/MDS image.

------------------------------------------------------------------------------

 SYSTEM SPECIFIC

  While there are some great documents out there for detecting system
  formats, here are the big three for NINJA 1.0. Anything your
  implementation of NINJA v1.0 does not have a function supporting, you
  should treat as RAW and MUST write "RAW" for the header for format, so
  all NINJA compatible patchers will treat it the same. If you encounter a
  patch set to a mode you do not support, print an error message and exit.

 SUPER NINTENDO

  SNES ROMs which have headers have an extra 0x200 bytes on the front of
  them. Once quick way to find out if a ROM has a header or not is to look
  at the checksum and inverse checksum (both 16bit). This will also help
  you detect LoROM/HiROM and interleaving.

  First read in 16 bits from 0x7fdc and add them to the 16 from 0x7fde. If
  you get 0xffff, you have no header. If it's garbage, try adding 0x200 to
  those offsets and see if you get 0xffff. If you do, there's a header. Now
  read in the lower nibble (bottom four bits) from 0x7fd5. If this is odd,
  you have an interleaved HiROM game. If it's even, you have a LoROM game.
  If you advanced 0x200 before, do it again for this offset.

  If advancing 0x200 yielded nothing, then take a look at 0xffdc. Take the
  16 bits from there and add it to the 16 from 0xffde and see if you get
  0xffff. If not, at 0x200 to the offsets check again.

  If your ROM is interleaved HiROM, you need to deinterleave it. For most
  ROM size, this is simple. Split the ROM into 0x8000 byte chunks. Put all
  the odd numbered chunks (assume 0 is your first chunk and 1 is second).
  first in the ROM, then write all the even chunks. If the ROM is 20Mbit
  or 24Mbit, however, there's a different interleave ordering. The pattern
  is shown below in chunk numbers:

  20Mbit (0x280000 bytes):
     1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35,
    37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71,
    73, 75, 77, 79, 64, 66, 68, 70, 72, 74, 76, 78, 32, 34, 36, 38, 40, 42,
    44, 46, 48, 50, 52, 54, 56, 58, 60, 62,  0,  2,  4,  6,  8, 10, 12, 14,
    16, 18, 20, 22, 24, 26, 28, 30

  24Mbit (0x300000 bytes):
     1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35,
    37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71,
    73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 64, 66, 68, 70, 72, 74,
    76, 78, 80, 82, 84, 86, 88, 90, 92, 94,  0,  2,  4,  6,  8, 10, 12, 14,
    16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50,
    52, 54, 56, 58, 60, 62

  SEGA MEGADRIVE

  Sega Megadrive is a bit easier. We only have two formats: BIN or SMD. BIN
  is what we want everything to be.

  Seek to 0x100 bytes and read in the first four bytes. If they are "SEGA"
  in ASCII, you have a BIN ROM and are all done. If you get "SG" and two
  bytes of JUNK, you probably have SMD. To be extra sure, read two bytes
  from 0x8. If it's 0xaabb, you have an SMD. SMD has an 0x200 byte header
  so seek to 0x200.

  We deinterleave the ROM in 0x4000 (16KB) chunks. Because the SMD
  interleaving is SO weird, I'm going to provide you with a block of code to
  deinterleave a 16KB chunk:

    $low   = 1;
    $high  = 0;
    $block = "";
    for($i=0; $i<0x2000; $i++) {
      $block[$low] = $chunk[$i];
      $block[$high] = $chunk[(0x2000+$i)];
      $low  = $o+2;
      $high = $e+2;
    }

  $block will now contain the deinterleaved chunk.

  NINTENDO GAME BOY

  Game Boy is very simple, we just have to worry about the archaic
  "Smart Card"(tm) headers. The headers are, surprise surprise, 0x200 bytes.
  Just mod the file size by 0x4000 (16KB, the smallest Game Boy ROM size).
  If you get 0, you have no header, if you get anything else, you do.

------------------------------------------------------------------------------